Bootstrapping polarity classifiers with rule-based classification

نویسندگان

  • Michael Wiegand
  • Manfred Klenner
  • Dietrich Klakow
چکیده

In this article, we examine the effectiveness of bootstrapping supervised machine-learning polarity classifiers with the help of a domain-independent rulebased classifier that relies on a lexical resource, i.e., a polarity lexicon and a set of linguistic rules. The benefit of this method is that though no labeled training data are required, it allows a classifier to capture in-domain knowledge by training a supervised classifier with in-domain features, such as bag of words, on instances labeled by a rule-based classifier. Thus, this approach can be considered as a simple and effective method for domain adaptation. Among the list of components of this approach, we investigate how important the quality of the rule-based classifier is and what features are useful for the supervised classifier. In particular, the former addresses the issue in how far linguistic modeling is relevant for this task. We not only examine how this method performs under more difficult settings in which classes are not balanced and mixed reviews are included in the data set but also compare how this linguistically-driven method relates to stateof-the-art statistical domain adaptation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bootstrapping Supervised Machine-learning Polarity Classifiers with Rule-based Classification

In this paper, we explore the effectiveness of bootstrapping supervised machine-learning polarity classifiers using the output of domain-independent rule-based classifiers. The benefit of this method is that no labeled training data are required. Still, this method allows to capture in-domain knowledge by training the supervised classifier on in-domain features, such as bag of words. We investi...

متن کامل

دسته‌بندی پرسش‌ها با استفاده از ترکیب دسته‌بندها

Question answering systems are produced and developed to provide exact answers to the question posted in natural language. One of the most important parts of question answering systems is question classification. The purpose of question classification is predicting the kind of answer needed for the question in natural language. The  literature works can be categorized as rule-based and learning...

متن کامل

FBM-Yahoo! at RepLab 2012

This paper describes FBM-Yahoo!’s participation in the profiling task of RepLab 2012, which aims at determining whether a given tweet is related to a specific company and, in if this being the case, whether it contains a positive or negative statement related to the company’s reputation or not. We addressed both problems (ambiguity and polarity reputation) using Support Vector Machines (SVM) cl...

متن کامل

SUBCLASS FUZZY-SVM CLASSIFIER AS AN EFFICIENT METHOD TO ENHANCE THE MASS DETECTION IN MAMMOGRAMS

This paper is concerned with the development of a novel classifier for automatic mass detection of mammograms, based on contourlet feature extraction in conjunction with statistical and fuzzy classifiers. In this method, mammograms are segmented into regions of interest (ROI) in order to extract features including geometrical and contourlet coefficients. The extracted features benefit from...

متن کامل

Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank

We present a novel graph-theoretic method for the initial annotation of high-confidence training data for bootstrapping sentiment classifiers. We estimate polarity using topic-specific PageRank. Sentiment information is propagated from an initial seed lexicon through a joint graph representation of words and documents. We report improved classification accuracies across multiple domains for the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Language Resources and Evaluation

دوره 47  شماره 

صفحات  -

تاریخ انتشار 2013